Abstract

There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size $t$. While there is no known polynomial-time algorithm for solving it optimally, there are many good heuristics that all provide additive errors to optimal. The additive error in these algorithms usually depends linearly upon the largest encoding letter size.

This paper was motivated by the problem of finding optimal codes when the encoding alphabet is infinite. Because the largest letter cost is infinite, the previous analyses could give infinite error bounds. We provide a new algorithm that works with infinite encoding alphabets. When restricted to the finite alphabet case, our algorithm often provides better error bounds than the best previous ones known.

Keywords: Prefix-Free Codes. Source-Coding. Redundancy. Entropy.

1 Introduction



Let $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_t\}$ be an encoding alphabet. Word $w \in \Sigma^*$ is a prefix of word $w' \in \Sigma^*$ if $w' = wu$ where $u \in \Sigma^*$ is a non-empty word. A code over $\Sigma$ is a collection of words $C = \{w_1, \ldots, w_n\}$. Code $C$ is prefix-free if for all $i \neq j$, $w_i$ is not a prefix of $w_j$. See Figure 1.

Let $cost(w)$ be the length or number of characters in $w$. Given a set of associated probabilities $p_1, p_2, \ldots, p_n \ge 0$, $\sum_i p_i = 1$, the cost of the code is $Cost(C) = \sum_{i=1}^{n} cost(w_i)\,p_i$. The prefix coding problem, sometimes known as the Huffman encoding problem, is to find a prefix-free code over $\Sigma$ of minimum cost. This problem is very well studied and has a well-known $O(tn \log n)$-time greedy algorithm due to Huffman [14] ($O(tn)$-time if the $p_i$ are sorted in non-decreasing order).

Figure 1: In this example $\Sigma = \{a, b\}$. The code on the left is {aaa, aab, ab, b}, which is prefix-free. The code on the right is {aaa, aab, ab, aaba}, which is not prefix-free because aab is a prefix of aaba. The second row of the tables contains the costs of the codewords when $cost(a) = 1$ and $cost(b) = 3$.

Figure 2: Two min-cost prefix-free codes for probabilities 2/6, 2/6, 1/6, 1/6 and their tree representations. The code on the left is optimal for $c_1 = c_2 = 1$ while the code on the right, the prefix-free code from Figure 1, is optimal for $c_1 = 1$, $c_2 = 3$.
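The definitions above can be checked mechanically. The following is a minimal sketch, not taken from the paper: the helper names `is_prefix_free`, `word_cost` and `code_cost` are hypothetical, and the pairing of codewords with probabilities in the usage lines is an assumption (cheapest words paired with the largest probabilities).

```python
from fractions import Fraction


def is_prefix_free(code):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(code)
    # In lexicographic order, a word that is a prefix of some other word
    # is also a prefix of its immediate successor, so adjacent pairs suffice.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def word_cost(word, letter_cost):
    """Cost of a word: the sum of the costs of its letters."""
    return sum(letter_cost[ch] for ch in word)


def code_cost(code, probs, letter_cost):
    """Cost(C) = sum_i cost(w_i) * p_i, with code[i] paired with probs[i]."""
    return sum(p * word_cost(w, letter_cost) for w, p in zip(code, probs))


# The two codes of Figure 1, with cost(a) = 1 and cost(b) = 3.
letter_cost = {"a": 1, "b": 3}
left = ["aaa", "aab", "ab", "b"]
right = ["aaa", "aab", "ab", "aaba"]
probs = [Fraction(2, 6), Fraction(2, 6), Fraction(1, 6), Fraction(1, 6)]
# Assumed pairing: the two cheapest words get the two largest probabilities.
paired = ["aaa", "b", "ab", "aab"]
```

With these inputs, `is_prefix_free(left)` holds, `is_prefix_free(right)` fails because aab is a prefix of aaba, and `code_cost(paired, probs, letter_cost)` evaluates to 7/2.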

Alphabetic coding is the same problem with the additional constraint that 
the codewords must be chosen in increasing alphabetic order (with respect 
to the words to be encoded). This corresponds, for example, to the problem 
of constructing optimal (with respect to average search time) search trees 
for items with the given access probabilities or frequencies. Such a code can 
be constructed in $O(tn^2)$ time [16].

One well studied generalization of the problem is to let the encoding 
letters have different costs. That is, let $\sigma_j \in \Sigma$ have associated cost $c_j$. The cost of codeword $w = \sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_l}$ will be $cost(w) = \sum_{k=1}^{l} c_{i_k}$, i.e., the sum of the costs of its letters (rather than the length of the codeword), with the cost of the code still being defined as $Cost(C) = \sum_{i=1}^{n} cost(w_i)\,p_i$ with this new cost function.

The existing, large, literature on the problem of finding a minimal-cost prefix-free code when the $c_i$ are no longer equal, which will be surveyed below, assumes that $\Sigma$ is a finite alphabet, i.e., that $t = |\Sigma| < \infty$. The original motivation of this paper was to address the problem when $\Sigma$ is unbounded, which, as will briefly be described in Section 3, models certain types of language restrictions on prefix-free codes and the imposition of different cost metrics on search trees. The tools developed, though, turn out to provide improved approximation bounds for many of the finite cases as well. More specifically, it was known [20, 23]¹ that $\frac{1}{c}H(p_1, \ldots, p_n) \le OPT$ where $H(p_1, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log p_i$ is the entropy of the distribution, $c$ is the unique positive root of the characteristic equation $1 = \sum_{i=1}^{t} 2^{-cc_i}$ and $OPT$ is the minimum cost of any prefix-free code for those $p_i$. Note that in this paper, $\log x$ will always denote $\log_2 x$.
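For a finite cost vector the characteristic root can be found numerically, since $\sum_i 2^{-cc_i}$ is strictly decreasing in $c$. The sketch below uses plain bisection; the function name `characteristic_root` and the tolerance are illustrative choices, not part of the paper, and it assumes at least two letters with positive costs.

```python
from math import log2, sqrt


def characteristic_root(costs, tol=1e-12):
    """Approximate the positive c with sum(2**(-c*c_i) for c_i in costs) == 1.

    Assumes a finite list of at least two positive letter costs.
    """
    def total(c):
        return sum(2.0 ** (-c * ci) for ci in costs)

    lo, hi = 0.0, 1.0
    while total(hi) > 1.0:        # grow the bracket until total(hi) <= 1
        hi *= 2.0
    while hi - lo > tol:          # total() is decreasing, so bisect
        mid = (lo + hi) / 2.0
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


c_binary = characteristic_root([1, 1])      # equal costs: c = 1
c_telegraph = characteristic_root([1, 2])   # x + x^2 = 1 with x = 2^(-c)
golden_log = log2((1 + sqrt(5)) / 2)        # closed form for costs (1, 2)
```

For costs $(1, 1)$ this returns $c = 1$, the standard entropy bound. For costs $(1, 2)$ the root agrees with $\log_2 \frac{1+\sqrt{5}}{2}$, because $x = 2^{-c}$ must satisfy $x + x^2 = 1$.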

The known efficient algorithms create a code $T$ that satisfies

$$C(T) \le \frac{1}{c}H(p_1, \ldots, p_n) + f(\mathcal{C}) \tag{1}$$

where $C(T)$ is the cost of code $T$, $\mathcal{C} = (c_1, c_2, \ldots, c_t)$ and $f(\mathcal{C})$ is some function of the letter costs $\mathcal{C}$, with the actual value of $f(\mathcal{C})$ depending upon the particular algorithm. Since $\frac{1}{c}H(p_1, \ldots, p_n) \le OPT$, code $T$ has an additive error of at most $f(\mathcal{C})$ from $OPT$. The $f(\mathcal{C})$ corresponding to the different algorithms shared an almost linear dependence upon the value $c_t = \max(\mathcal{C})$, the largest letter cost. They therefore cannot be used for infinite $\mathcal{C}$. In this paper we present a new algorithmic variation (all algorithms for this problem start with the same splitting procedure so they are all, in some sense, variations of each other) with a new analysis:

• (Theorems 2 and 3) For finite $\mathcal{C}$ we derive new additive error bounds $f(\mathcal{C})$ which, in many cases, are much better than the old ones.

• (Lemma 9) If $\mathcal{C}$ is infinite but $d_j = |\{m \mid j \le c_m < j + 1\}|$ is bounded, then we can still give a bound of type (1). For example, if $c_m = 1 + \lfloor \frac{m-1}{2} \rfloor$, i.e., exactly two letters each of length $i = 1, 2, 3, \ldots$, then we can show that $f(\mathcal{C}) \le 1 + [\ldots]$

• (Theorem 4) If $\mathcal{C}$ is infinite but $d_j$ is unbounded then we cannot provide a bound of type (1) but, as long as $\sum_{m=1}^{\infty} c_m 2^{-cc_m} < \infty$, we can show that

$$\forall \epsilon > 0, \quad C(T) \le (1 + \epsilon)\frac{1}{c}H(p_1, \ldots, p_n) + f(\mathcal{C}, \epsilon) \tag{2}$$

where $f(\mathcal{C}, \epsilon)$ is some constant based only on $\mathcal{C}$ and $\epsilon$.

¹ Note that if $t = 2$ with $c_1 = c_2 = 1$ then $c = 1$ and this reduces to the standard entropy lower bound for prefix-free coding. Although the general lower bound is usually only explicitly derived for finite $t$, Krause [20] showed how to extend it to infinite $t$ in cases where a positive root of $1 = \sum_{i=1}^{\infty} 2^{-cc_i}$ exists.

We now provide some more history and motivation. 

For a simple example, refer to Figure 2. Both codes have minimum cost for the frequencies $(p_1, p_2, p_3, p_4) = (\frac{2}{6}, \frac{2}{6}, \frac{1}{6}, \frac{1}{6})$ but under different letter costs. The code {00, 01, 10, 11} has minimum cost for the standard Huffman problem in which $\Sigma = \{0, 1\}$ and $c_1 = c_2 = 1$, i.e., the cost of a word is the number of bits it contains. The code {aaa, aab, ab, b} has minimum cost for the alphabet $\Sigma = \{a, b\}$ in which the length of an "a" is 1 and the length of a "b" is 3, i.e., $\mathcal{C} = (1, 3)$.

The unequal letter cost coding problem was originally motivated by coding problems in which different characters have different transmission times or storage costs [1, 22, 18, 27, 28]. One example is the telegraph channel [21 [ini ED] in which $\Sigma = \{\cdot, -\}$ and $c_1 = 1$, $c_2 = 2$, i.e., in which dashes are twice as long as dots. Another is the $(a, b)$ run-length-limited codes used in magnetic and optical storage [15^ [TT], in which the codewords are binary and constrained so that each 1 must be preceded by at least $a$, and at most $b$, 0's. (This example can be modeled by the unequal-cost letter problem by using an encoding alphabet of $r = b - a + 1$ characters $\{0^k1 : k = a, a + 1, \ldots, b\}$ with associated costs $\{c_i = a + i - 1\}$.)

The unequal letter cost alphabetic coding problem arises in designing testing procedures in which the time required by a test depends upon the outcome of the test [19, 6.2.2, ex. 33] and has also been studied under the names dichotomous search [13] or the leaky shower problem [TT].

The literature contains many algorithms for the unequal-cost coding problem. Blachman [3], Marcus [22], and (much later) Gilbert [10] give heuristic constructions without analyses of the costs of the codes they produced. Karp gave the first algorithm yielding an exact solution (assuming the letter costs are integers); Karp's algorithm transforms the problem into an integer program and does not run in polynomial time [18]. Later exact algorithms based on dynamic programming were given by Golin and Rote [11] for arbitrary $t$ and a slightly more efficient one by Bradford et al. [5] for $t = 2$. These algorithms run in $n^{\Theta(c_t)}$ time where $c_t$ is the cost of the largest letter. Despite the extensive literature, there is no known polynomial-time algorithm for the generalized problem, nor is the problem known to be NP-hard. Golin, Kenyon and Young [12] provide a polynomial time approximation scheme (PTAS). Their algorithm is mainly theoretical and not useful in practice. Finally, in contrast to the non-alphabetic case, alphabetic coding has a polynomial-time, $O(tn^2)$, algorithm [16].

Karp's result was followed by many efficient algorithms [20, 8, 7, 23, 2]. As mentioned above, $\frac{1}{c}H(p_1, \ldots, p_n) \le OPT$; almost² all of these algorithms produce codes of cost at most $C(T) \le \frac{1}{c}H(p_1, \ldots, p_n) + f(\mathcal{C})$ and therefore give solutions within an additive error of optimal. An important observation is that the additive errors $f(\mathcal{C})$ in these papers all incorporate the cost of the largest letter $c_t = \max(\mathcal{C})$. Typical in this regard is Mehlhorn's algorithm [23], which provides a bound of

$$c\,C(T) - H(p_1, \ldots, p_n) \le (1 - p_1 - p_n) + c\,c_t \tag{3}$$

Thus, none of the algorithms described can be used to address infinite alphabets with unbounded letter costs.

The algorithms all work by starting with the probabilities in some given 
order, grouping consecutive probabilities together according to some rule, 
assigning the same initial codeword prefix to all of the probabilities in the 
same group and then recursing. They therefore actually create alphabetic 
codes. Another unstated assumption in those papers (related to their definition of alphabetic coding) is that the order of the $c_m$ is given and must be maintained.

In this paper we are only interested in the general coding problem and not the alphabetic one and will therefore have freedom to dictate the original order in which the $p_i$ are given and the ordering of the $c_m$. We will actually always assume that $p_1 \ge p_2 \ge p_3 \ge \cdots$ and $c_1 \le c_2 \le c_3 \le \cdots$. These assumptions are the starting point that will permit us to derive better bounds. Furthermore, for simplicity, we will always assume that $c_1 = 1$. If not, we can always force this by uniformly scaling all of the $c_i$.

For further references on Huffman coding with unequal letter costs, see Abrahams' survey on source coding [1, Section 2.7], which contains a section on the problem.

2 Notations and definitions 

There is a very standard correspondence between prefix-free codes over alphabet $\Sigma$ and $|\Sigma|$-ary trees in which the $m$-th child of node $v$ is labelled with character $\sigma_m \in \Sigma$. A path from the root in a tree to a leaf will correspond to the word constructed by reading the edge labels while walking the path. The tree $T$ corresponding to code $C = \{w_1, \ldots, w_n\}$ will be the tree containing the paths corresponding to the respective words. Note that the leaves in the tree will then correspond to codewords while internal nodes will correspond to prefixes of codewords. See Figures 2 and 5.

² As mentioned by Mehlhorn [23], the result of Cot [7] is a bit different. It is a redundancy bound and it is not clear how to implement it efficiently as an algorithm. Also, the redundancy bound is in a very different form, involving the ratio of roots of multiple equations, which makes it difficult to compare to the others in the literature.

Because this correspondence is 1-1 we will speak about codes and trees interchangeably, with the cost of a tree being the cost of the associated code.

Definition 1 Let $C$ be a prefix-free code over $\Sigma$ and $T$ its associated tree. $N_T$ will denote the set of internal nodes of $T$.

Definition 2 Set $c$ to be the unique positive solution to $1 = \sum_{i=1}^{t} 2^{-cc_i}$. Note that if $t < \infty$, then $c$ must exist while if $t = \infty$, $c$ might not exist. We only define $c$ for the cases in which it exists. $c$ is sometimes called the root of the characteristic equation of the letter costs.

Definition 3 Given letter costs $c_i$ and their associated characteristic root $c$, let $T$ be a code with those letter costs. If $p_1, p_2, \ldots, p_n \ge 0$ is a probability distribution then the redundancy of $T$ relative to the $p_i$ is

$$R(T; p_1, \ldots, p_n) = C(T) - \frac{1}{c}H(p_1, \ldots, p_n).$$

We will also define the normalized redundancy to be

$$NR(T; p_1, \ldots, p_n) = cR = c\,C(T) - H(p_1, \ldots, p_n).$$

If the $p_i$ and $T$ are understood, we will write $R(T)$ ($NR(T)$) or even $R$ ($NR$).

We note that many of the previous results in the literature, e.g., (3) from [23], were stated in terms of $NR$. We will see later that this is a very natural measure for deriving bounds. Also, note that by the lower bound previously mentioned, $C(T) \ge \frac{1}{c}H(p_1, \ldots, p_n)$ for all $T$ and $p_i$, so $R(T; p_1, \ldots, p_n)$ is a good measure of absolute error.
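As a small numeric illustration of Definition 3, the sketch below evaluates $R$ and $NR$ for the left-hand code of Figure 2. The function names are hypothetical, and the characteristic root is passed in as a parameter rather than computed; for the equal-cost binary alphabet $c = 1$.

```python
from math import log2


def entropy(probs):
    """H(p_1, ..., p_n) with logarithms base 2 (terms with p = 0 contribute 0)."""
    return -sum(p * log2(p) for p in probs if p > 0)


def redundancy(tree_cost, probs, c):
    """R(T) = C(T) - H / c, where c is the characteristic root."""
    return tree_cost - entropy(probs) / c


def normalized_redundancy(tree_cost, probs, c):
    """NR(T) = c * C(T) - H = c * R(T)."""
    return c * tree_cost - entropy(probs)


# Left code of Figure 2: {00, 01, 10, 11}, every word costs 2, and c = 1.
probs = [2 / 6, 2 / 6, 1 / 6, 1 / 6]
r = redundancy(2.0, probs, c=1.0)
nr = normalized_redundancy(2.0, probs, c=1.0)
```

Here `r` and `nr` coincide because $c = 1$, and both are small and nonnegative, as the lower bound $C(T) \ge \frac{1}{c}H$ requires.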

3 Examples of Unequal-Cost Letters 

It is very easy to understand the unequal-cost letter problem as modelling situations in which different characters have different transmission times or storage costs [1, 22, 18, 27, 28]. Such cases will all have finite alphabets. It is not a priori as clear why infinite alphabets would be interesting. We now discuss some motivation.

In what follows we will need some basic language notation. A language $\mathcal{L}$ is just a set of words over alphabet $\Sigma$. The concatenation of languages $A$ and $B$ is $AB = \{ab \mid a \in A, b \in B\}$. The $i$-fold concatenation, $\mathcal{L}^i$, is defined by $\mathcal{L}^0 = \{\lambda\}$ (the language containing just the empty string), $\mathcal{L}^1 = \mathcal{L}$ and $\mathcal{L}^i = \mathcal{L}\mathcal{L}^{i-1}$. The Kleene star of $\mathcal{L}$ is $\mathcal{L}^* = \bigcup_{i=0}^{\infty} \mathcal{L}^i$.

We start with cost vector $\mathcal{C} = \{1, 2, 3, \ldots\}$, i.e., $\forall m > 0$, $c_m = m$. An early use of this problem was [24]. The idea there was to construct a tree (not a code) in which the internal pointers to children were stored in a linked list. Taking the $m$-th pointer corresponds to using character $\sigma_m$. The time that it takes to find the $m$-th pointer is proportional to the location of the pointer in the list. Thus (after normalizing time units) $c_m = m$.

We now consider a generalization of the problem of 1-ended codes. The problem of finding a min-cost prefix-free code with the additional restriction that all codewords end with a 1 was studied in [3l E] with the motivation of designing self-synchronizing codes. One can model this problem as follows. Let $\mathcal{L}$ be a language. In our problem,

$$\mathcal{L} = \{w \in \{0, 1\}^* \mid \text{the last letter in } w \text{ is a } 1\}.$$

We say that a code $C$ is in $\mathcal{L}$ if $C \subseteq \mathcal{L}$. The problem is to find a minimum cost code among all codes in $\mathcal{L}$.

Now suppose further that $\mathcal{L}$ has the special property that $\mathcal{L} = Q^*$ where $Q$ is itself a prefix-free language. Then every word in $\mathcal{L}$ can be uniquely decomposed as the concatenation of words in $Q$. If the decomposition of $w \in \mathcal{L}$ is $w = q_1 q_2 \ldots q_r$ for $q_i \in Q$ then $cost(w) = \sum_{i=1}^{r} cost(q_i)$. We can therefore model the problem of finding a minimum cost code among all codes in $\mathcal{L}$ by first creating an infinite alphabet $\Sigma_Q = \{\sigma_q \mid q \in Q\}$ with associated cost vector $\mathcal{C}_Q$ (in which the length of $\sigma_q$ is $cost(q)$) and then solving the minimal cost coding problem for $\Sigma_Q$ with those associated costs. For the example of 1-ended codes we set $Q = \{1, 01, 001, 0001, \ldots\}$ and thus have $\mathcal{C} = \{1, 2, 3, \ldots\}$, i.e., an infinite alphabet with $c_m = m$ for all $m \ge 1$.
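For this particular infinite alphabet the characteristic root can be computed in closed form. The short derivation below is a worked check under the stated assumption $c_m = m$; it is not taken from the paper.

```latex
\begin{align*}
\sum_{m=1}^{\infty} 2^{-c\,c_m}
  = \sum_{m=1}^{\infty} \left(2^{-c}\right)^{m}
  = \frac{2^{-c}}{1 - 2^{-c}} = 1
  \iff 2^{-c} = \tfrac{1}{2}
  \iff c = 1, \\
\text{and then}\quad
\sum_{m=1}^{\infty} c_m 2^{-c\,c_m}
  = \sum_{m=1}^{\infty} m\,2^{-m} = 2 < \infty .
\end{align*}
```

So a positive root exists even though $t = \infty$, and each integer cost is used by exactly one letter.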

Now consider generalizing the problem as follows. Suppose we are given an unequal cost coding problem with finite alphabet $\Sigma = \{\sigma_1, \ldots, \sigma_t\}$ and associated cost vector $\mathcal{C} = (c_1, \ldots, c_t)$. Now let $\Sigma' \subseteq \Sigma$ and define

$$\mathcal{L} = \Sigma^*\Sigma' = \{w \in \Sigma^* \mid \text{the last letter in } w \text{ is in } \Sigma'\}.$$

Now note that $\mathcal{L} = D^*$ where $D = (\Sigma - \Sigma')^*\Sigma'$ is a prefix-free language. We can therefore model the problem of finding a minimum cost code among all codes in $\mathcal{L}$ by solving an unequal cost coding problem with alphabet $\Sigma_D$ and $\mathcal{C}_D$. The important observation is that

$$d_j = |\{d \in \Sigma_D \mid cost(d) = j\}|,$$

the number of letters in $\Sigma_D$ of length $j$, satisfies a linear recurrence relation. Bounding redundancies for these types of $\mathcal{C}$ will be discussed in Section 6, Case 4.

As an illustration, consider $\Sigma = \{1, 2, 3\}$ with $\mathcal{C} = (1, 1, 2)$ and $\Sigma' = \{1\}$; our problem is to find minimal cost prefix-free codes in which all words end with a 1. $\mathcal{L} = \{1, 2, 3\}^*\{1\} = D^*$, where $D = \{2, 3\}^*\{1\}$. The number of characters in $\Sigma_D$ with length $j$ is

$$d_1 = 1,\ d_2 = 1,\ d_3 = 2,\ d_4 = 3,\ d_5 = 5$$

and, in general, $d_{j+2} = d_{j+1} + d_j$, so $d_i = F_i$, the Fibonacci numbers.

We conclude with a very natural $\mathcal{C}$ for which we do not know how to analyze the redundancy. In Section 6, Case 5 we will discuss why this is difficult.

Let $\mathcal{L}$ be the set of all "balanced" binary words³, i.e., all words which contain exactly as many 0's as 1's. Note that $\mathcal{L} = D^*$ where $D$ is the set of all non-empty balanced words $w$ such that no non-empty proper prefix of $w$ is balanced. Let

$$l_j = |\{w \in \mathcal{L} \mid cost(w) = j\}|, \qquad d_j = |\{w \in D \mid cost(w) = j\}|$$

and set $L(z) = \sum_{n=0}^{\infty} l_n z^n$ and $D(z) = \sum_{n=0}^{\infty} d_n z^n$ to be their associated generating functions. If $\mathcal{L} = D^*$, then standard generating function rules, see e.g., [25], state that $L(z) = (1 - D(z))^{-1}$. Observe that $l_n = 0$ if $n$ is odd and $l_n = \binom{n}{n/2}$ for $n$ even, so

$$L(z) = \sum_{n=0}^{\infty} \binom{2n}{n} z^{2n} = \frac{1}{\sqrt{1 - 4z^2}}$$

and

$$\sum_{n=1}^{\infty} d_n z^n = D(z) = 1 - \sqrt{1 - 4z^2}.$$

This can then be solved to see that, for even $n > 0$, $d_n = 2C_{n/2-1}$ where $C_i = \frac{1}{i+1}\binom{2i}{i}$ is the Catalan number. For $n = 0$ or odd $n$, $d_n = 0$.
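The closed form for $d_n$ can be sanity-checked by brute force for small $n$. The sketch below is an illustrative check rather than part of the paper: it enumerates all binary words of length $n$, keeps the balanced ones with no balanced non-empty proper prefix, and compares the count with $2C_{n/2-1}$. The function names are hypothetical.

```python
from itertools import product
from math import comb


def catalan(i):
    """The i-th Catalan number C_i = binom(2i, i) / (i + 1)."""
    return comb(2 * i, i) // (i + 1)


def count_irreducible_balanced(n):
    """Count length-n binary words that are balanced and have no
    balanced non-empty proper prefix (exhaustive enumeration)."""
    count = 0
    for word in product((0, 1), repeat=n):
        height, ok = 0, n > 0
        for pos, bit in enumerate(word, start=1):
            height += 1 if bit else -1      # +1 for a 1, -1 for a 0
            if height == 0 and pos < n:     # a proper prefix is balanced
                ok = False
                break
        if ok and height == 0:
            count += 1
    return count


def closed_form(n):
    """d_n = 2 * C_{n/2 - 1} for even n > 0, and 0 otherwise."""
    return 2 * catalan(n // 2 - 1) if n > 0 and n % 2 == 0 else 0
```

Comparing `count_irreducible_balanced(n)` with `closed_form(n)` for $n = 0, \ldots, 12$ exercises the formula on the first six non-zero terms.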

Figure 3: The first splitting step for a case when $n = 6$, $c_1 = 1$, $c_2 = 2$, $c_3 = 3$ and the associated preliminary tree. This step groups $p_1, p_2, p_3$ as the first group, $p_4, p_5$ as the second and $p_6$ by itself. Note that we haven't yet formally explained why we've grouped the items this way.

Figure 4: In the second split, $p_1$ is kept by itself and $p_2, p_3$ are grouped together.

Figure 5: After two more splits, the final coding tree is constructed. The associated code is $\{\sigma_1\sigma_1,\ \sigma_1\sigma_2\sigma_1,\ \sigma_1\sigma_2\sigma_2,\ \sigma_2\sigma_1,\ \sigma_2\sigma_2,\ \sigma_3\}$.



4 The algorithm 

All of the provably efficient heuristics for the problem, e.g., [20, 8, 7, 23, 2], use the same basic approach, which itself is a generalization of Shannon's original binary splitting algorithm [26]. The idea is to create $t$ bins, where bin $m$ has weight $2^{-cc_m}$ (so the sum of all bin weights is 1). The algorithms then try to partition the probabilities into the bins; bin $m$ will contain a set of contiguous probabilities $p_{l_m}, p_{l_m+1}, \ldots, p_{r_m}$ whose sum will have total weight "close" to $2^{-cc_m}$. The algorithms fix the first letter of all the codewords associated with the $p_k$ in bin $m$ to be $\sigma_m$. After fixing the first letter, the algorithms then recurse, normalizing $p_{l_m}, p_{l_m+1}, \ldots, p_{r_m}$ to sum to 1, taking them as input and starting anew. The various algorithms differ in how they group the probabilities and how they recurse. See Figures 3, 4 and 5 for an illustration of this generic procedure.

³ This also generalizes a problem from [2T which provides heuristics for constructing a min-cost prefix-free code in which the expected number of 0's equals the expected number of 1's.

Here we use a generalization of the version introduced in [23]. The algorithm first preprocesses the input and calculates all $P_k = p_1 + p_2 + \cdots + p_k$ ($P_0 = 0$) and $s_k = p_1 + p_2 + \cdots + p_{k-1} + \frac{p_k}{2}$. Note that if we lay out the $p_i$ along the unit interval in order, then $s_k$ can be seen as the midpoint of interval $p_k$. It then partitions the probabilities into ranges, and for each range it constructs left and right boundaries $L_m, R_m$. $p_k$ will be assigned to bin $m$ if it "falls" into the "range" $[L_m, R_m)$.

If the interval $p_k$ falls into the range, i.e., $L_m \le P_{k-1} < P_k < R_m$, then $p_k$ should definitely be in bin $m$. But what if $p_k$ spans two (or more) ranges, e.g., $L_m \le P_{k-1} < R_m < P_k$? To which bin should $p_k$ be assigned? The choice made by [23] is that $p_k$ is assigned to bin $m$ if $s_k = p_1 + p_2 + \cdots + p_{k-1} + \frac{p_k}{2}$ falls into $[L_m, R_m)$, i.e., the midpoint of $p_k$ falls into the range.

Our procedure $CODE(l, r, U)$ will build a prefix-free code for $p_l, \ldots, p_r$ in which every code word starts with prefix $U$. To build the entire code we call $CODE(1, n, \lambda)$, where $\lambda$ is the empty string.

The procedure works as follows (Figure 6 gives pseudocode and Figures 7, 8 and 9 illustrate the concepts):

Assume that we currently have a prefix of $U$ assigned to $p_l, \ldots, p_r$. Let $v$ be the node in the tree associated with $U$. Let $w(v) = \sum_{k=l}^{r} p_k$.

(i) If $l = r$ then word $U$ is assigned to $p_l$. Correspondingly, $v$ is a leaf in the tree with weight $w(v) = p_l$.

(ii) Otherwise let $L = P_{l-1}$ and $R = P_r$. Split $R - L = w(v)$ into $t$ ranges⁴ as follows:

$$\forall\, 1 \le m \le t, \quad L_m = L + (R - L)\sum_{i=1}^{m-1} 2^{-cc_i}, \qquad R_m = L + (R - L)\sum_{i=1}^{m} 2^{-cc_i}.$$

Insert $p_k$, $l \le k \le r$, in bin $m$ if $s_k \in [L_m, R_m)$. Bin $m$ will thus contain the $p_k$ in $I^*_m(v) = \{k \mid L_m \le s_k < R_m\}$.
We now shift the items $p_k$ leftward in the bins as follows. Walk through the bins from left to right. If the current bin already contains some $p_k$, continue to the next bin. If the current bin is empty, take the first $p_k$ that appears in a bin to the right of the current one, shift $p_k$ into the current bin and walk to the next bin. Stop when all $p_k$ have been seen. Let $I_m(v)$ denote the items in the bins after this shifting.

⁴ In the description, $t$ is permitted to be finite or infinite.

CODE(l, r, U):
  {Constructs codewords U_l, U_{l+1}, ..., U_r for p_l, p_{l+1}, ..., p_r.
   U is the previously constructed common prefix of U_l, U_{l+1}, ..., U_r.}
  If l = r
    then codeword U_l is set to be U.
    else {Distribute the p_i's into initial bins I*_m}
      L = P_{l-1}; R = P_r; w = R - L
      for all m, let L_m = L + w * sum_{i=1}^{m-1} 2^{-c c_i} and R_m = L_m + w * 2^{-c c_m}.
      set I*_m = {k | L_m <= s_k < R_m}

      {Shift the bins to become the final I_m. Afterwards,
       all bins > M are empty, all bins <= M are non-empty
       and for all m <= M, I_m = {l_m, ..., r_m}}

      {Shift left so there are no empty "middle" bins.}
      M = 0; k = l;
      while k <= r do
        M = M + 1;
        l_M = k; r_M = max({k} ∪ {i > k | i ∈ I*_M});
        k = r_M + 1;

      {If all p_i's are in the first bin, shift p_r to the 2nd bin}
      if r_1 = r then
        M = 2;
        r_1 = r - 1; l_2 = r_2 = r;

      for m = 1 to M do
        CODE(l_m, r_m, U σ_m);

Figure 6: Our algorithm. Note that the first step of creating the $I^*_m$ was written to simplify the development of the analysis. In practice, it is not needed since $I^*_M$ is only used to find $\max\{i > k \mid i \in I^*_M\}$ and this value can be calculated using binary search at the time it is required.

Figure 7: The first step in our algorithm's splitting procedure, $n = 6$. $L_i = \sum_{m=1}^{i-1} 2^{-cc_m}$. Note that even though only the first 5 $L_i$ are shown, there might be an infinite number of them (if $t = \infty$). Note too that, for $1 < i$, $L_i = R_{i-1}$.

Figure 8: The splitting procedure performed on the above example creates the bins $I^*_m$ on the left. The shifting procedure then creates the $I_m$ on the right.

Figure 9: An illustration of the recursive step of the algorithm. $p_i, p_{i+1}, p_{i+2}$ have been grouped together. In the next splitting step, the interval operated on has length $w = p_i + p_{i+1} + p_{i+2}$.

Note that after performing this shifting there is some $M(v)$ such that all bins $m \le M(v)$ are nonempty and all bins $m > M(v)$ are empty. Also notice that it is not necessary to actually construct the $I^*_m(v)$ first. We only did so because they will be useful in our later analysis. We can more efficiently construct the $I_m(v)$ from scratch by walking from left to right, using a binary search each time, to find the rightmost item that should be in the current bin. This will take $O(M(v) \log(r - l))$ time in total.

We then check if all of the items are in $I_1(v)$. If they are, we take $p_r$ and move it into $I_2(v)$ (and set $M(v) = 2$).

Finally, after creating all of the $I_m(v)$ we let $l_m = \min\{k \in I_m(v)\}$ and $r_m = \max\{k \in I_m(v)\}$ and recurse, for each $m \le M(v)$ building $CODE(l_m, r_m, U\sigma_m)$.

It is clear that the algorithm builds some prefix code with associated tree $T$. As defined, let $N_T$ be the set of internal nodes of $T$. Since every internal node of $T$ has at least two children, $\sum_{v \in N_T} M(v) \le 2n - 1$.

The algorithm uses $O(1)$ time at each of its $n$ leaves and $O(M(v) \log n)$ time at node $v$. Its total running time is thus bounded by

$$n + \sum_{v \in N_T} \log n \cdot M(v) = O(n \log n)$$

with no dependence upon $t$.

For comparison, we point out that the algorithm in [23] also starts by first finding the $I^*_m(v)$. Since it assumed $t < \infty$, its shifting stage was much simpler, though. It just shifted $p_l$ into the first bin and $p_r$ into the $t$-th bin (if they were not already there).

We will now see that our modified shifting procedure not only permits a finite algorithm for infinite encoding alphabets, but also often provides a provably better approximation for finite encoding alphabets.
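The procedure of this section can be sketched in Python for a finite cost vector. This is an illustrative reading of the description, not the authors' implementation: the names `characteristic_root` and `build_code` are hypothetical, indices are 0-based, floating point is used for the bin boundaries, and the last letter is made to absorb any remaining items so that rounding cannot leave an item unassigned. It assumes the $p_i$ are nonincreasing and the $c_m$ nondecreasing with $t \ge 2$.

```python
from bisect import bisect_left
from itertools import accumulate
from math import log2


def characteristic_root(costs, tol=1e-12):
    """Bisection for the positive c with sum(2**(-c*c_i)) == 1 (t >= 2)."""
    def total(c):
        return sum(2.0 ** (-c * ci) for ci in costs)
    lo, hi = 0.0, 1.0
    while total(hi) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if total(mid) > 1.0 else (lo, mid)
    return (lo + hi) / 2.0


def build_code(probs, costs):
    """Return one tuple of 0-based letter indices per probability."""
    n, t = len(probs), len(costs)
    c = characteristic_root(costs)
    prefix = [0.0] + list(accumulate(probs))              # prefix[k] = P_k
    mids = [prefix[k] + probs[k] / 2 for k in range(n)]   # midpoints s_k
    words = [None] * n

    def code(l, r, word):              # items l..r inclusive
        if l == r:
            words[l] = word
            return
        left, w = prefix[l], prefix[r + 1] - prefix[l]
        bins, k, m, right = [], l, 0, left
        while k <= r:
            right += w * 2.0 ** (-c * costs[m])    # right boundary R_m
            if m == t - 1:
                last = r               # last letter takes whatever remains
            else:
                # Rightmost item whose midpoint lies left of R_m; an
                # empty bin receives item k instead (the left shift).
                last = max(k, bisect_left(mids, right, k, r + 1) - 1)
            bins.append((k, last))
            k, m = last + 1, m + 1
        if len(bins) == 1:             # all items in bin 1: right-shift p_r
            bins = [(l, r - 1), (r, r)]
        for m, (lo, hi) in enumerate(bins):
            code(lo, hi, word + (m,))

    code(0, n - 1, ())
    return words


example = build_code([2 / 6, 2 / 6, 1 / 6, 1 / 6], [1, 3])
```

For the probabilities of Figure 2 with costs $(1, 3)$, the sketch produces the four two-letter words, i.e. `[(0, 0), (0, 1), (1, 0), (1, 1)]`. That code costs $22/6$, slightly above the optimum $21/6$ achieved by {aaa, aab, ab, b}, which is consistent with the algorithm being a heuristic with a bounded additive error rather than an exact method.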

5 Analysis 

In the analysis we define $w^*_m(v) = \sum_{k \in I^*_m(v)} p_k$ and $w_m(v) = \sum_{k \in I_m(v)} p_k$. Note that $w(v) = \sum_{m=1}^{t} w^*_m(v) = \sum_{m=1}^{t} w_m(v) = \sum_{k=l}^{r} p_k$.

We first need three Lemmas from [23]. The first was proven by recursion on the nodes of a tree, the second followed from the definition of the splitting procedure and the third from the second by some algebraic manipulations.

Lemma 1 Let $T$ be a code tree and $N_T$ be the set of all internal nodes of $T$. Then

1. The cost $C(T)$ of the code tree $T$ is

$$C(T) = \sum_{v \in N_T} \sum_{m=1}^{t} c_m \cdot w_m(v).$$

2. The entropy $H(p_1, p_2, \ldots, p_n)$ is

$$H(p_1, p_2, \ldots, p_n) = \sum_{v \in N_T} w(v) \cdot H\left(\frac{w_1(v)}{w(v)}, \frac{w_2(v)}{w(v)}, \ldots\right).$$

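Lemma 1 permits expressing the normalized redundancy of $T$ as

$$NR(T) = c \cdot C(T) - H(p_1, p_2, \ldots, p_n) = \sum_{v \in N_T} w(v) \sum_{m=1}^{t} \frac{w_m(v)}{w(v)} \left( \log 2^{cc_m} + \log \frac{w_m(v)}{w(v)} \right).$$

Set

$$E(v, m) = \frac{w_m(v)}{w(v)} \left( \log 2^{cc_m} + \log \frac{w_m(v)}{w(v)} \right).$$

Note that $NR(T) = \sum_{v \in N_T} w(v) \left( \sum_{m=1}^{t} E(v, m) \right)$. For convenience we will also define

$$E^*(v, m) = \frac{w^*_m(v)}{w(v)} \left( \log 2^{cc_m} + \log \frac{w^*_m(v)}{w(v)} \right), \qquad NR^*(T) = \sum_{v \in N_T} w(v) \left( \sum_{m=1}^{t} E^*(v, m) \right).$$

The analysis proceeds by bounding the values of $NR^*(T)$ and $NR(T) - NR^*(T)$.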

Lemma 2 [23]⁵ (Note: in this Lemma, the $p_i$ can be arbitrarily ordered.) Consider any call $CODE(l, r, U)$ with $l < r$. Let node $v$ correspond to the word $U$. Let sets $I^*_1, I^*_2, \ldots$ be defined as in procedure CODE.

a) If $I^*_m = \emptyset$, then $w^*_m(v) = 0$.

b) If $I^*_m = \{e\}$, then $w^*_m(v) = p_e$.

c) If $|I^*_m| \ge 2$, let $e = \min I^*_m$ and $f = \max I^*_m$.

If $m = 1$, then $\frac{w^*_1(v)}{w(v)} \le 2^{-cc_1} + \frac{p_f}{2w(v)} \le 2 \cdot 2^{-cc_1}$.

If $m = t$ (note that this case requires $t < \infty$), then $\frac{w^*_t(v)}{w(v)} \le 2^{-cc_t} + \frac{p_e}{2w(v)} \le 2 \cdot 2^{-cc_t}$.

If $2 \le m < t$, then $\frac{w^*_m(v)}{w(v)} \le 2^{-cc_m} + \frac{p_e + p_f}{2w(v)} \le 2 \cdot 2^{-cc_m}$.

⁵ Slightly rewritten for our notation.

Lemma 3 [23] (Note: in this Lemma, the $p_i$ can be arbitrarily ordered.) In case (c) of Lemma 2, $E^*(v, m) \le \frac{p_e + p_f}{w(v)}$. Furthermore, if $m = 1$ then $E^*(v, m) \le \frac{p_f}{w(v)}$, while if $m = t$, then $E^*(v, m) \le \frac{p_e}{w(v)}$.

Corollary 4 If the $p_i$ are sorted in nonincreasing order then in case (c) of Lemma 2, if $m = 1$, $E^*(v, m) \le \frac{p_f}{w(v)}$, while if $m > 1$, then $E^*(v, m) \le \frac{2p_e}{w(v)}$.

Lemma 5

$$NR - NR^* \le c(c_2 - c_1) \sum_{i \in A} p_i$$

where

$$A = \{i \mid i \text{ is right shifted by the algorithm at some step}\}.$$

Note: $p_1$ can never be right shifted, so $\sum_{i \in A} p_i \le 1 - p_1$.

Proof: Define

$$X(v) = \sum_{m=1}^{t} w(v) E(v, m) \quad \text{and} \quad X^*(v) = \sum_{m=1}^{t} w(v) E^*(v, m).$$

Note that $NR = \sum_{v \in N_T} X(v)$ and $NR^* = \sum_{v \in N_T} X^*(v)$, so it suffices to compare $X^*(v)$ and $X(v)$. If no shifts were performed while processing $v$, then $X^*(v) = X(v)$ and there is nothing to do. We now examine the two mutually exclusive cases of performing left shifts or performing a right shift.

Left shifts: 



Every step in our left-shifting procedure involves taking a probability out of some bin $m$ and moving it into some currently empty bin $r < m$. Let $w'_m(v)$ be the weight in bin $m$ before that shift and $p$ be the probability of the item being shifted. Note that the original weight of bin $r$ was $w'_r(v) = 0$ while after the shift, bin $r$ will have weight $p$ and bin $m$ weight $w'_m(v) - p$. We use the trivial fact

$$\forall p, q \ge 0, \quad p \log p + q \log q \le (p + q) \log(p + q). \tag{4}$$

Setting $q = w'_m(v) - p$ in (4) implies

$$p \log \frac{p}{w(v)} + (w'_m(v) - p) \log \frac{w'_m(v) - p}{w(v)} \le w'_m(v) \log \frac{w'_m(v)}{w(v)}.$$

Furthermore, the fact that the $c_i$ are nondecreasing (so $c_r \le c_m$) implies

$$p \log 2^{cc_r} + (w'_m(v) - p) \log 2^{cc_m} \le w'_m(v) \log 2^{cc_m}. \tag{5}$$

Combining the last two equations gives that

$$p \left( \log 2^{cc_r} + \log \frac{p}{w(v)} \right) + (w'_m(v) - p) \left( \log 2^{cc_m} + \log \frac{w'_m(v) - p}{w(v)} \right) \le w'_m(v) \left( \log 2^{cc_m} + \log \frac{w'_m(v)}{w(v)} \right).$$

Since moving from $X^*(v)$ to $X(v)$ involves only operations in which probabilities are shifted to the left into an empty bucket, the analysis above implies that $X(v) \le X^*(v)$.
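For completeness, the "trivial fact" (4) used in the left-shift analysis follows from the monotonicity of the logarithm. The short derivation below is a supporting sketch under the usual convention $0 \log 0 = 0$; it is not spelled out in the paper.

```latex
\begin{align*}
p\log p + q\log q
  &\le p\log(p+q) + q\log(p+q)
     && \text{since } \log \text{ is increasing and } p,\, q \le p+q \\
  &= (p+q)\log(p+q). \\
\intertext{Subtracting $(p+q)\log w(v)$ from both sides, with $q = w'_m(v) - p$, gives}
p\log\frac{p}{w(v)} + q\log\frac{q}{w(v)}
  &\le w'_m(v)\,\log\frac{w'_m(v)}{w(v)} .
\end{align*}
```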

Right shifts:

Consider node $v$. Suppose that all of the probabilities in $v$ fall into $I^*_1$ with $I^*_1 = \{p_e, \ldots, p_f\}$ and $e \ne f$. Since $p_f$ starts in bin 1, $p_e$ must be totally contained in bin 1, so $p_e \le 2^{-cc_1} w(v)$. The algorithm shifts $p_f$ to the right, giving $I_1 = I^*_1 - \{p_f\}$ and $I_2 = \{p_f\}$. The $p_i$ are nonincreasing so $p_f \le p_e$, and therefore

$$E(v, 2) = \frac{p_f}{w(v)} \left( \log 2^{cc_2} + \log \frac{p_f}{w(v)} \right) \le \frac{p_f}{w(v)} (cc_2 - cc_1).$$

Also $E(v, 1) \le E^*(v, 1)$. Thus

$$\sum_{m=1}^{t} w(v) E(v, m) = w(v) E(v, 1) + w(v) E(v, 2) \le w(v) E^*(v, 1) + p_f (cc_2 - cc_1).$$

Once a $p_f$ is right-shifted it immediately becomes a leaf and can never be right-shifted again.

Combining the analyses of left shifts and right shifts gives

$$NR = \sum_{v \in N_T} X(v) \le \sum_{v \in N_T} X^*(v) + c(c_2 - c_1) \sum_{i \in A} p_i = NR^* + c(c_2 - c_1) \sum_{i \in A} p_i.$$

□



Lemma 6

$$NR^* \le 2(1 - p_1) + \sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v) E^*(v, m).$$

Proof: We evaluate $NR^*$ by partitioning it into

$$NR^* = \sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| \ge 2}} w(v) E^*(v, m) + \sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v) E^*(v, m). \tag{6}$$

We use a generalization of an amortization argument developed in [23] to bound the first summand. From Corollary 4 we know that if $|I^*_m(v)| \ge 2$ with $e = \min I^*_m(v)$ and $f = \max I^*_m(v)$ then $w(v) E^*(v, m)$ is at most (a) $p_f$ or (b) $2p_e$, depending upon whether (a) $m = 1$, or (b) $m > 1$.

Suppose that some $p_i$ appears as $2p_i$ in such a bound because $i = \min I^*_m(v)$, i.e., case (b). Then, in all later recursive steps of the algorithm $i$ will always be the leftmost item in bin 1 and will therefore not be used in any later case (a) or (b) bound.

Now suppose that some $p_i$ appears in such a bound because $i = \max I^*_1(v)$, i.e., case (a). Then in all later recursive steps of the algorithm, $i$ will always be the rightmost item in the rightmost non-empty bin. The only possibility for it to be used in a later bound is if it becomes the rightmost item in bin 1, i.e., all of the probabilities are in $I^*_1(v)$. In this case, $p_i$ is used for a second case (a) bound. Note that if this happens, then $p_i$ is immediately right shifted, becomes a leaf in bin 2, and is never used in any later recursion.

Any given probability $p_i$ can therefore be used either once as a case (b) bound and contribute $2p_i$, or twice as a case (a) bound and again contribute $2p_i$. Furthermore, $p_1$ can never appear in a case (a) or (b) bound because, until it becomes a leaf, it can only be the leftmost item in bin 1. Thus

$$\sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| \ge 2}} w(v) E^*(v, m) \le 2(1 - p_1). \tag{7}$$

□

Note: In Mehlhorn's original proof [23] the value corresponding to the RHS of (7) was $(1 - p_1 - p_n)$. This is because the shifting step of Mehlhorn's algorithm guaranteed that $|I_t(v)| \ne 0$ and thus there was a symmetry between the analysis of leftmost and rightmost. In our situation $t$ might be infinity so we cannot assume that the rightmost non-empty bin is $t$ and we get $2(1 - p_1)$ instead.

Combining this Lemma with Lemma [5] gives 

Corollary 7 

NR<2(l-pi) + c(c2-ci)^p, + ^ w{v)E*{v,m). 

We will now see different bounds on the last summand in the above expression. Section 6 compares the results we get to previous ones for different classes of $C$. Before proceeding, we note that any $p_i$ can only appear as $I^*_m(v) = \{p_i\}$ for at most one $(m,v)$ pair. Furthermore, if it does appear in such a way, then it can not have been made a leaf by a previous right shift and thus $p_i \notin A$.

We start by noting that, when $t < \infty$, our bound is never worse than 1 plus the old bound of $(1 - p_1 - p_n) + cc_t$ stated in (3).

Theorem 1 If $t < \infty$ then

$$NR \le 2(1 - p_1) + cc_t.$$

Proof: If $I^*_m(v) = \{p_i\}$ then $w(v)E^*(v,m) \le p_i c c_m$ so

$$\sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \le \sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ I^*_m(v) = \{p_i\}}} p_i c c_m \le cc_t \sum_{i \notin A} p_i.$$

The theorem then follows from Corollary 7. □

For a tighter analysis we will need a better bound for the case $|I^*_m(v)| = 1$.

Lemma 8 (a) Let $v \in N_T$. Suppose $i$ is such that $i \in I^*_m(v)$. Then

$$\frac{p_i}{w(v)} \le 2\sum_{j=m}^{t} 2^{-cc_j}.$$

(b) Further suppose there is some $m' > m$ such that $I^*_{m'}(v) \ne \emptyset$. Then

$$\frac{p_i}{w(v)} \le 2\sum_{j=m}^{m'} 2^{-cc_j} \le 3\sum_{j=m}^{m'-1} 2^{-cc_j}.$$

Proof: Consider the call $CODE(l, r, U)$ at node $v$. The fact that $i \in I^*_m(v)$ implies $L + w(v)\sum_{j=1}^{m-1} 2^{-cc_j} = L_m \le s_i$. To prove (a) just note that

$$s_i + \frac{p_i}{2} = P_i \le P_r = R = L + w(v)\sum_{j=1}^{t} 2^{-cc_j}.$$

So $\frac{p_i}{2} \le w(v)\sum_{j=m}^{t} 2^{-cc_j}$.

To prove part (b) let $i' \in I^*_{m'}(v)$. Then

$$s_i + \frac{p_i}{2} = P_i < s_{i'} \le L + w(v)\sum_{j=1}^{m'} 2^{-cc_j}.$$

So $\frac{p_i}{2} < w(v)\sum_{j=m}^{m'} 2^{-cc_j}$. The final inequality follows from the fact that $c_{m'-1} \le c_{m'}$. □

Definition 4 Set $\beta_m = 2^{cc_m}\sum_{i=m}^{t} 2^{-cc_i}$ and $\beta = \sup\{\beta_m \mid 1 \le m \le t\}$.

We can now prove our first improved bound: 

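Theorem 2 If $\beta < \infty$ then

$$NR \le 2(1 - p_1) + \max\big(c(c_2 - c_1),\, 1 + \log\beta\big).$$

Proof: Note that using Definition 4 and Lemma 8(a) we can bound the last summand in Corollary 7 as

$$\begin{aligned} w(v)E^*(v,m) &= w^*_m(v)\left(\log 2^{cc_m} + \log\frac{w^*_m(v)}{w(v)}\right) \\ &\le w^*_m(v)\log\left(2^{cc_m}\cdot 2\sum_{i=m}^{t} 2^{-cc_i}\right) \\ &\le w^*_m(v)(1 + \log\beta). \end{aligned}$$

If $I^*_m(v) = \{i\}$ then $i$ was not a leaf in any previous step and therefore could not have been right shifted, so $i \notin A$. Thus

$$\sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \le (1 + \log\beta)\sum_{i \notin A} p_i.$$

□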

This immediately gives an improved bound for many finite cases because, if $t < \infty$, then $\beta_m = 2^{cc_m}\sum_{i=m}^{t} 2^{-cc_i} \le t - m + 1$ so $\beta \le t$. Thus

Theorem 3 If $t$ is finite then

$$NR \le 2(1 - p_1) + \max\big(c(c_2 - c_1),\, 1 + \log t\big).$$
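As a numerical sanity check of the step $\beta \le t$, the following minimal Python sketch computes the root $c$ of the characteristic equation by bisection, then $\beta$ from Definition 4, for one finite cost vector. The cost vector and all function names are illustrative assumptions, not part of the paper; logarithms are taken base 2 and the costs are assumed to satisfy $c_i \ge 1$.

```python
import math

def char_root(costs, iters=200):
    """Root c of sum_i 2**(-c*c_i) = 1, found by bisection (needs >= 2 letters)."""
    lo, hi = 0.0, 64.0  # assumes all costs are >= 1, so the root lies below 64
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(2.0 ** (-mid * ci) for ci in costs) > 1.0:
            lo = mid  # sum too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

def beta(costs, c):
    """beta = max_m 2**(c*c_m) * sum_{i >= m} 2**(-c*c_i) for sorted costs."""
    costs = sorted(costs)
    return max(
        2.0 ** (c * costs[m]) * sum(2.0 ** (-c * ci) for ci in costs[m:])
        for m in range(len(costs))
    )

def theorem3_cap(costs, p1=0.0):
    """Right-hand side of Theorem 3 for a given p1 (p1 = 0 is the worst case)."""
    costs = sorted(costs)
    c = char_root(costs)
    return 2 * (1 - p1) + max(c * (costs[1] - costs[0]), 1 + math.log2(len(costs)))

costs = [1, 1, 2, 5, 9]  # hypothetical letter costs
c = char_root(costs)
b = beta(costs, c)
print(round(c, 4), round(b, 4), len(costs), round(theorem3_cap(costs), 4))
```

For this hypothetical alphabet the computed $\beta$ stays well below $t = 5$, so Theorem 3 is not tight here and Theorem 2 would give the smaller constant.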
Definition 5 For all $j \ge 1$, set

$$d_j = |\{i \mid j \le c_i < j + 1\}|.$$

This permits us to give another general bound that also works for many 
infinite alphabets. 

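Lemma 9 If $d_j = O(1)$, then $NR = O(1)$. In particular, if $\forall j$, $d_j \le K$ then $\beta \le \frac{K 2^{c}}{1 - 2^{-c}}$ so, from Theorem 2,

$$NR \le 2(1 - p_1) + \max\left(c(c_2 - c_1),\, 1 + c + \log\frac{K}{1 - 2^{-c}}\right).$$

Furthermore, if all of the $c_i$ are integers, then $\beta \le \frac{K}{1 - 2^{-c}}$ and

$$NR \le 2(1 - p_1) + \max\left(c(c_2 - c_1),\, 1 + \log\frac{K}{1 - 2^{-c}}\right).$$

Proof: Since $c_1 = 1$ we must have $2^{-c} < 1$. Thus, for all $m \ge 1$, if $\ell \le c_m < \ell + 1$ then

$$\beta_m = 2^{cc_m}\sum_{i=m}^{t} 2^{-cc_i} \le 2^{c(\ell+1)}\sum_{j=\ell}^{\infty} d_j 2^{-cj} \le 2^{c(\ell+1)} K \frac{2^{-c\ell}}{1 - 2^{-c}} = \frac{2^{c}K}{1 - 2^{-c}}$$

which is independent of $m$ and $\ell$. The analysis when the $c_i$ are all integers is similar. □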
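The bound of Lemma 9 can be checked numerically on a non-integral cost sequence. The sketch below is an illustration under stated assumptions, not the paper's own example: it takes the hypothetical costs $c_m = 1 + (m-1)/2$ (two letters per unit interval, so $K = 2$), truncated to 400 letters as a stand-in for the infinite alphabet. For this sequence the characteristic equation reduces to $y^2 = 1 - y$ with $y = 2^{-c/2}$, which gives $c$ in closed form. All function names are illustrative.

```python
import math

def d_counts(costs):
    """d_j = number of costs c_i with j <= c_i < j + 1 (Definition 5)."""
    counts = {}
    for ci in costs:
        j = math.floor(ci)
        counts[j] = counts.get(j, 0) + 1
    return counts

def beta_m(costs, c, m):
    """beta_m = 2**(c*c_m) * sum_{i >= m} 2**(-c*c_i), with m 1-indexed."""
    return 2.0 ** (c * costs[m - 1]) * sum(2.0 ** (-c * ci) for ci in costs[m - 1:])

# Hypothetical costs c_m = 1 + (m-1)/2, truncated to 400 letters.
costs = [1 + (m - 1) / 2 for m in range(1, 401)]
K = max(d_counts(costs).values())

# Closed-form root: with y = 2**(-c/2) the characteristic equation is y*y = 1 - y.
y = (math.sqrt(5) - 1) / 2
c = -2 * math.log2(y)

# Only the first 200 values of m are used so that the truncated tail is negligible.
beta_est = max(beta_m(costs, c, m) for m in range(1, 201))
lemma9_cap = K * 2.0 ** c / (1 - 2.0 ** (-c))
print(K, round(beta_est, 3), round(lemma9_cap, 3))
```

Here every $\beta_m$ equals the geometric sum $1/(1-y) \approx 2.618$, comfortably below the cap $K2^{c}/(1 - 2^{-c}) \approx 8.47$ of the lemma.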

For general infinite alphabets we are not able to derive a constant re- 
dundancy bound but we can prove 

Theorem 4 If $C$ is infinite and $\sum_{m=1}^{\infty} c_m 2^{-cc_m} < \infty$, then, for every $\epsilon > 0$,

$$R \le \epsilon\,\frac{1}{c}H(p_1,\ldots,p_n) + f(C,\epsilon) \qquad (8)$$

where $f(C,\epsilon)$ is some constant based only on $C$ and $\epsilon$. Note that this is equivalent to stating that

$$C(T) \le (1 + \epsilon)OPT + f(C,\epsilon).$$

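Proof: We must bound the

$$\sum_{v \in N_T} \sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v)E^*(v,m)$$

term from the right hand side of Corollary 7. Recall that $|I^*_m(v)| = 1$ means that $\exists i$ such that $I^*_m(v) = \{i\}$, i.e., $w^*_m(v) = p_i$ and thus $w(v)E^*(v,m) \le p_i c c_m$.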

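Let $N_\epsilon$ be a value to be determined later and $m_\epsilon = \max\{m \mid c_m \le N_\epsilon\}$. Since no probability appears more than once in the sum we can write

$$\sum_{v \in N_T} \sum_{\substack{1 \le m \le m_\epsilon \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \le cN_\epsilon.$$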

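To analyze the remaining cases, fix $v$. Consider the set of indices

$$M_v = \{m \mid (m > m_\epsilon) \text{ and } |I^*_m(v)| = 1\}.$$

Sort these indices in increasing order so that $M_v = \{m_1, m_2, \ldots, m_r\}$ for some $r$ with $m_1 < m_2 < \cdots < m_r$. Let $i_j$ be such that $I^*_{m_j}(v) = \{p_{i_j}\}$. Thus

$$\sum_{\substack{m_\epsilon < m \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \le \sum_{j=1}^{r} p_{i_j} c c_{m_j}.$$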



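Lemma 8 and the fact that the $c_m$ are non-decreasing then gives

$$\sum_{j=1}^{r} p_{i_j} c c_{m_j} \le c\,w(v)\left(\sum_{j=1}^{r-1} 3c_{m_j}\sum_{m=m_j}^{m_{j+1}-1} 2^{-cc_m} + 2c_{m_r}\sum_{m=m_r}^{\infty} 2^{-cc_m}\right) \le 3c\,w(v)\sum_{m > m_\epsilon} c_m 2^{-cc_m}.$$

We are given that $\sum_{m=1}^{\infty} c_m 2^{-cc_m}$ converges. Thus $g(m_\epsilon) \downarrow 0$ as $m_\epsilon \to \infty$ where $g(x) = \sum_{m > x} c_m 2^{-cc_m}$.

Note that as $N_\epsilon$ increases, $m_\epsilon$ increases. Given $\epsilon$, we now choose $N_\epsilon$ to be the smallest value such that $g(m_\epsilon) \le \frac{\epsilon}{6}$. Note that $N_\epsilon$ is independent of $v$.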
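To make the choice of $N_\epsilon$ concrete, the following Python sketch evaluates the tail $g$ for one hypothetical instance, the integer costs $c_m = m$ with $c = 1$, and searches for the smallest integer $N$ with $g(N) \le \epsilon/6$. With $c_m = m$ we have $m_\epsilon = N_\epsilon$ for integer $N_\epsilon$, which is the assumption used below; the function names and the truncation of the infinite tail at 200 terms are illustrative choices only.

```python
def g(x, terms=200):
    """Tail sum g(x) = sum_{m > x} c_m * 2**(-c*c_m) for c_m = m and c = 1.

    The infinite tail is truncated after `terms` summands; the neglected
    remainder is far below double precision for this instance.
    """
    return sum(m * 2.0 ** (-m) for m in range(x + 1, x + 1 + terms))

def smallest_N(eps):
    """Smallest integer N with g(N) <= eps / 6 (here m_eps = N since c_m = m)."""
    N = 0
    while g(N) > eps / 6:
        N += 1
    return N

for eps in (0.5, 0.1, 0.01):
    print(eps, smallest_N(eps))
```

For this instance the tail has the closed form $g(x) = (x + 2)2^{-x}$, so $N_\epsilon$ grows only logarithmically in $1/\epsilon$.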

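Combine the above bounds:

$$\begin{aligned} \sum_{v \in N_T}\sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) &= \sum_{v \in N_T}\sum_{\substack{1 \le m \le m_\epsilon \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) + \sum_{v \in N_T}\sum_{\substack{m_\epsilon < m \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \\ &\le cN_\epsilon + \sum_{v \in N_T} \frac{\epsilon}{2}\,c\,w(v). \end{aligned}$$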

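Recall from Lemma 1 and the fact that $\forall m$, $c_m \ge 1$,

$$C(T) = \sum_{v \in N_T}\sum_{m=1}^{t} c_m w^*_m(v) \ge \sum_{v \in N_T}\sum_{m=1}^{t} w^*_m(v) = \sum_{v \in N_T} w(v).$$

Thus, we have just seen that

$$\sum_{v \in N_T}\sum_{\substack{1 \le m \le t \\ |I^*_m(v)| = 1}} w(v)E^*(v,m) \le cN_\epsilon + \frac{\epsilon}{2}\,c\,C(T).$$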

y^J\[j. l<m<t 

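Plugging back into Corollary 7 gives

$$cC(T) - H(p_1,\ldots,p_n) \le 2(1 - p_1) + c(c_2 - c_1) + cN_\epsilon + \frac{\epsilon}{2}\,cC(T)$$

which can be rewritten as

$$C(T) - \frac{1}{1 - \epsilon/2}\cdot\frac{1}{c}H(p_1,\ldots,p_n) \le \frac{1}{1 - \epsilon/2}\cdot\frac{1}{c}\big(2(1 - p_1) + c(c_2 - c_1) + cN_\epsilon\big).$$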



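We may assume that $\epsilon \le 1/2$, so $1 + \epsilon \ge \frac{1}{1 - \epsilon/2}$. Thus

$$C(T) - (1 + \epsilon)\frac{1}{c}H(p_1,\ldots,p_n) \le f(C,\epsilon)$$

where

$$f(C,\epsilon) = \frac{4}{3}\left(\frac{2}{c} + (c_2 - c_1) + N_\epsilon\right). \qquad (9)$$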

This can then be rewritten as

$$R = C(T) - \frac{1}{c}H(p_1,\ldots,p_n) \le \epsilon\,\frac{1}{c}H(p_1,\ldots,p_n) + f(C,\epsilon) \le \epsilon\,OPT + f(C,\epsilon)$$

proving the Theorem. □

6 Examples



We now examine some of the bounds derived in the last section and show how they compare to the old bound of $(1 - p_1 - p_n) + cc_t$ stated in (3). In particular, we show that for large families of costs the old bounds go to infinity while the new ones give uniformly constant bounds.

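Case 1: $C_\alpha = (c_1, c_2, \ldots, c_{t-1}, \alpha)$ with $\alpha \uparrow \infty$.

We assume $t \ge 3$ and all of the $c_i$, $i < t$, are fixed. Let $c^{(\alpha)}$ be the root of the corresponding characteristic equation $1 = 2^{-c\alpha} + \sum_{i=1}^{t-1} 2^{-cc_i}$. Note that $c^{(\alpha)} \downarrow c$ where $c$ is the root of $1 = \sum_{i=1}^{t-1} 2^{-cc_i}$. Let $(NR_\alpha)$ $R_\alpha$ be the (normalized) redundancy corresponding to $C_\alpha$.

For any fixed $\alpha$, the old bound (3) would give

$$NR_\alpha \le (1 - p_1 - p_n) + c^{(\alpha)}\alpha, \qquad R_\alpha \le \frac{1 - p_1 - p_n}{c^{(\alpha)}} + \alpha,$$

the right hand sides of both of which tend to $\infty$ as $\alpha$ increases. Compare this to Theorem 3 which gives a uniform bound of

$$NR_\alpha \le 2(1 - p_1) + \max\big(c^{(\alpha)}(c_2 - c_1),\, 1 + \log t\big) \le 2(1 - p_1) + \max\big(c^{(c_{t-1})}(c_2 - c_1),\, 1 + \log t\big)$$

and

$$R_\alpha = \frac{NR_\alpha}{c^{(\alpha)}} \le \frac{2(1 - p_1) + \max\big(c^{(c_{t-1})}(c_2 - c_1),\, 1 + \log t\big)}{c}.$$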

For concreteness, we examine a special case of the above. 

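Example 1 Let $t = 3$ with $c_1 = c_2 = 1$ and $c_3 = \alpha > 1$. The old bound (3) gives an asymptotically infinite error as $\alpha \to \infty$. The bound from Theorem 3 is

$$NR_\alpha \le 2(1 - p_1) + \max\big(c^{(\alpha)}(c_2 - c_1),\, 1 + \log t\big) \le 3 + \log 3$$

independent of $\alpha$. Since $c^{(\alpha)} \ge c = 1$ we also get

$$R_\alpha = \frac{NR_\alpha}{c^{(\alpha)}} \le 3 + \log 3.$$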
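The contrast in Example 1 can be tabulated numerically. The sketch below solves $1 = 2\cdot 2^{-c} + 2^{-c\alpha}$ by bisection for a few values of $\alpha$ and compares the worst case of the old bound (taking $p_1, p_n \to 0$) with the worst case of the Theorem 3 bound (taking $p_1 \to 0$). The function name, the chosen values of $\alpha$, and the base-2 logarithm are assumptions of this illustration.

```python
import math

def root_alpha(alpha, iters=200):
    """Root c of 1 = 2*2**(-c) + 2**(-c*alpha), by bisection on [0, 64]."""
    lo, hi = 0.0, 64.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if 2 * 2.0 ** (-mid) + 2.0 ** (-mid * alpha) > 1.0:
            lo = mid  # sum too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

new_cap = 3 + math.log2(3)  # Theorem 3 bound on NR with p_1 -> 0
rows = []
for alpha in (2, 10, 100, 1000):
    c_a = root_alpha(alpha)
    old_cap = 1 + c_a * alpha  # old bound on NR with p_1, p_n -> 0
    rows.append((alpha, c_a, old_cap, new_cap / c_a))
    print(alpha, round(c_a, 5), round(old_cap, 2), round(new_cap / c_a, 3))
```

The third column grows linearly in $\alpha$ while the last column, the resulting cap on $R_\alpha$, never exceeds $3 + \log 3$.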

Case 2: A finite alphabet that approaches an infinite one.

Let $C$ be an infinite sequence of letter costs such that there exists a $K > 0$ satisfying for all $j$, $d_j = |\{i \mid j \le c_i < j + 1\}| \le K$. Let $c$ be the root of the characteristic equation $1 = \sum_{i=1}^{\infty} 2^{-cc_i}$. Let $\Sigma^{(t)} = \{\sigma_1, \ldots, \sigma_t\}$ and its associated letter costs be $C^{(t)} = \{c_1, \ldots, c_t\}$. Let $c^{(t)}$ be the root of the corresponding characteristic equation $1 = \sum_{i=1}^{t} 2^{-cc_i}$ and $(NR_t)$ $R_t$ be the associated (normalized) redundancy. Note that $c^{(t)} \uparrow c$ as $t$ increases.

For any fixed $t$, the old bound (3) would be $NR_t \le (1 - p_1 - p_n) + c^{(t)}c_t$ which goes to $\infty$ as $t$ increases. Lemma 9 tells us that

$$\beta^{(t)} = \max_{1 \le m \le t} 2^{c^{(t)}c_m}\sum_{i=m}^{t} 2^{-c^{(t)}c_i} \le \frac{K 2^{c^{(t)}}}{1 - 2^{-c^{(t)}}} \le \frac{K 2^{c}}{1 - 2^{-c^{(2)}}}$$

so, from Theorem 2 and the fact that $\forall t$, $c^{(2)} \le c^{(t)} \le c$, we get

$$NR_t \le 2(1 - p_1) + \max\left(c(c_2 - c_1),\, 1 + c + \log\frac{K}{1 - 2^{-c^{(2)}}}\right).$$

Note that if all of the $c_m$ are integers, then the additive factor $c$ will vanish.

Example 2 Let $C = \{1, 2, 3, \ldots\}$, i.e., $c_m = m$. The old bound (3) gives an asymptotically infinite error as $t \to \infty$.

For this case $c = 1$ and $K = 1$. $c^{(2)}$ is the root of the characteristic equation $1 = 2^{-c} + 2^{-2c}$. Solving gives $2^{-c^{(2)}} = \frac{\sqrt{5} - 1}{2}$ and $c^{(2)} = 1 - \log(\sqrt{5} - 1) \approx 0.694\ldots$. Plugging into our equations gives

$$NR_t \le 2(1 - p_1) + \max\left(c(c_2 - c_1),\, 1 + \log\frac{K}{1 - 2^{-c^{(2)}}}\right) = 2(1 - p_1) + 1 + \log\frac{2}{3 - \sqrt{5}} \le 4.389$$

and

$$R_t = \frac{NR_t}{c^{(t)}} \le \frac{NR_t}{c^{(2)}} \le 6.322.$$
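The constants in Example 2 are easy to recompute. The short Python sketch below evaluates $c^{(2)}$, the cap on $NR_t$ (with $p_1 \to 0$ and $K = 1$) and the resulting cap on $R_t$, with logarithms base 2; the variable names are illustrative.

```python
import math

x = (math.sqrt(5) - 1) / 2  # x = 2**(-c2), the positive root of x + x*x = 1
c2 = -math.log2(x)          # c^(2)

# Integer-cost form of the Lemma 9 bound with K = 1 and p_1 -> 0.
nr_cap = 2 + 1 + math.log2(1 / (1 - x))
r_cap = nr_cap / c2

print(round(c2, 4), round(nr_cap, 4), round(r_cap, 4))
```

Since $1 - x = x^2$, the cap on $NR_t$ equals $3 + 2c^{(2)}$ and the cap on $R_t$ equals $2 + 3/c^{(2)}$.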

Case 3: An infinite case when $d_j = O(1)$.
In this case just apply Lemma 9 directly.

Example 3 Let $C$ contain $d$ copies each of $i = 1, 2, 3, \ldots$, i.e., $c_m = 1 + \lfloor \frac{m-1}{d} \rfloor$. Note that $K = d$. If $d = 1$, i.e., $c_m = m$, then $c = K = 1$ and

$$R = NR \le 2(1 - p_1) + 2.$$

If $d > 1$ then $A(x) = \sum_{m=1}^{\infty} x^{c_m} = \frac{dx}{1 - x}$. The solution $\alpha$ to $A(\alpha) = 1$ is $\alpha = \frac{1}{d+1}$, so $c = -\log\alpha = \log(d + 1)$. The lemma gives

$$NR \le 2(1 - p_1) + \left(1 + \log\frac{d}{1 - 2^{-c}}\right) \le 3 + \log(d + 1), \qquad R \le 1 + \frac{3}{\log(d + 1)}.$$

Case 4: $d_j$ are integral and satisfy a linear recurrence relation.
In this case the generating function $A(z) = \sum_{j=1}^{\infty} d_j z^j = \sum_{m=1}^{\infty} z^{c_m}$ can be written as $A(z) = \frac{P(z)}{Q(z)}$ where $P(z)$ and $Q(z)$ are relatively prime polynomials. Let $\gamma$ be a smallest modulus root of $Q(z)$. If $\gamma$ is the unique root of that modulus (which happens in most interesting cases) then it is known that $d_j = \Theta(j^{d-1}\gamma^{-j})$ (which will also imply that $\gamma$ is positive real) where $d$ is the multiplicity of the root. There must then exist some $\alpha < \gamma$ such that $A(\alpha) = 1$. By definition $c = -\log\alpha$. Furthermore, since $\alpha < \gamma$ we must have that $\sum_{j=1}^{\infty} d_j j \alpha^j = \sum_{m=1}^{\infty} c_m \alpha^{c_m}$ also converges, so Theorem 4 applies. Note that

$$h(x) = \sum_{j=x}^{\infty} j d_j \alpha^j = O\big(x^{d}(\alpha/\gamma)^{x}\big)$$

implying

$$h^{-1}(\epsilon) = \log_{\gamma/\alpha} 1/\epsilon + O(\log\log 1/\epsilon)$$

where we define

$$h^{-1}(\epsilon) = \max\{x \mid h(x) \le \epsilon,\ h(x - 1) > \epsilon\}.$$

Working through the proof of Theorem 4 we find that when the $c_m$ are all integral,

$$\forall m', \quad g(m') = \sum_{m > m'} c_m 2^{-cc_m} = \sum_{m > m'} c_m \alpha^{c_m}.$$

Recall that $m_\epsilon = \max\{m \mid c_m \le N_\epsilon\}$. Then $g(m_\epsilon) \le h(N_\epsilon)$. Since $g(m_\epsilon) \le \epsilon/6$,

$$N_\epsilon \le h^{-1}(\epsilon/6) = \log_{\gamma/\alpha} 1/\epsilon + O(\log\log 1/\epsilon)$$

and thus our algorithm creates a code $T$ satisfying

$$C(T) - OPT \le \epsilon\,OPT + \log_{\gamma/\alpha} 1/\epsilon + O(\log\log 1/\epsilon). \qquad (10)$$
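The recipe of Case 4 (locate $\gamma$, solve $A(\alpha) = 1$ on $(0, \gamma)$, then read off $c = -\log\alpha$ and the base $\gamma/\alpha$ of the logarithm in (10)) can be sketched numerically. The Python code below is a minimal illustration under the assumption that $A$ is increasing on $(0, \gamma)$, which holds whenever all $d_j \ge 0$. It is run on two instances: the constant sequence $d_j = 2$, with $A(z) = 2z/(1 - z)$ and $\gamma = 1$, and the Fibonacci sequence $d_j = F_j$, with $A(z) = z/(1 - z - z^2)$ and $\gamma = \phi^{-1}$. The function and variable names are illustrative.

```python
import math

def solve_alpha(A, gamma, iters=200):
    """Bisection for alpha in (0, gamma) with A(alpha) = 1.

    Assumes A is continuous and increasing on (0, gamma) with A(0) = 0,
    which holds for a power series with non-negative coefficients d_j.
    """
    lo, hi = 0.0, gamma
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if A(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# d_j = 2 for all j: A(z) = 2z/(1 - z), Q(z) = 1 - z, so gamma = 1.
alpha_const = solve_alpha(lambda z: 2 * z / (1 - z), 1.0)

# d_j = F_j (Fibonacci): A(z) = z/(1 - z - z^2), gamma = 1/phi.
gamma_fib = 2 / (1 + math.sqrt(5))
alpha_fib = solve_alpha(lambda z: z / (1 - z - z * z), gamma_fib)

for name, alpha, gamma in (("constant", alpha_const, 1.0),
                           ("fibonacci", alpha_fib, gamma_fib)):
    # columns: alpha, c = -log2(alpha), gamma/alpha
    print(name, round(alpha, 6), round(-math.log2(alpha), 4), round(gamma / alpha, 4))
```

The constant instance recovers $\alpha = 1/3$ and $c = \log 3$, matching Example 3 with $d = 2$.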
Example 4 Consider the case where $d_j = F_j$, the $j$th Fibonacci number, $F_1 = 1$, $F_2 = 1$, $F_3 = 2, \ldots$. It's well known that

$$A(z) = \sum_{j=1}^{\infty} d_j z^j = \frac{z}{1 - z - z^2}$$

and $F_j = \frac{\phi^j - (1 - \phi)^j}{\sqrt{5}}$ where $\phi = \frac{1 + \sqrt{5}}{2}$. Thus $d_j = \frac{1}{\sqrt{5}}\gamma^{-j} + \epsilon_j$ where $\gamma = \phi^{-1}$ and $|\epsilon_j| < 1$. Solving $A(\alpha) = 1$ gives $\alpha = \sqrt{2} - 1 \approx .4142\ldots$ (and $c = -\log\alpha = 1.2715\ldots$). (10) gives a bound on the cost of the redundancy of our code with $\frac{\gamma}{\alpha} = \frac{\phi^{-1}}{\sqrt{2} - 1} \approx 1.492\ldots$.

Case 5: An example for which there is no known bound.
An interesting open question is how to bound the redundancy for the case of balanced words described at the end of Section 3. Recall that this had $d_j$ integral with $d_j = 0$ for $j = 0$ and odd $j$ and for even $j > 0$, $d_j = 2C_{j/2-1}$ where $C_j = \frac{1}{j+1}\binom{2j}{j}$ is the $j$th Catalan number. It's well known that $\sum_{j=0}^{\infty} C_j x^j = \frac{1}{2x}\left(1 - \sqrt{1 - 4x}\right)$ so

$$A(x) = \sum_{m=1}^{\infty} x^{c_m} = \sum_{j=1}^{\infty} d_j x^j = 1 - \sqrt{1 - 4x^2}.$$

Solving for $A(\alpha) = 1$ gives $\alpha = \frac{1}{2}$ and $c = -\log\alpha = 1$. On the other hand,

$$\sum_{m=1}^{\infty} c_m x^{c_m} = \sum_{j=1}^{\infty} j d_j x^j = 4\sum_{j=1}^{\infty}\binom{2(j-1)}{j-1} x^{2j} = \frac{4x^2}{\sqrt{1 - 4x^2}}$$

so this sum does not converge when $x = 1/2$. Thus, we can not use Theorem 4 to bound the redundancy. Some observation shows that this $C$ does not satisfy any of our other theorems either. It remains an open question as to how to construct a code with "small" redundancy for this problem, i.e., a code with a constant additive approximation or something similar to Theorem 4.
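The behaviour in Case 5 can be observed numerically. The sketch below accumulates, at $x = 1/2$, the partial sums of $\sum_j d_j x^j$ and of $\sum_j j d_j x^j$ over even $j = 2k$, using the Catalan ratio $C_k / C_{k-1} = 2(2k - 1)/(k + 1)$ to update the terms in floating point. The function name and the cut-off values are illustrative assumptions.

```python
def partial_sums(n_terms):
    """Partial sums at x = 1/2 over even j = 2k, k = 1..n_terms.

    s0 approximates sum_j d_j 2**(-j) and s1 approximates sum_j j*d_j 2**(-j),
    where d_j = 2*C_{j/2-1} for even j > 0 and d_j = 0 otherwise.
    """
    term = 0.5  # k = 1: d_2 * 2**(-2) = 2*C_0 / 4
    s0 = s1 = 0.0
    for k in range(1, n_terms + 1):
        s0 += term
        s1 += 2 * k * term
        term *= (2 * k - 1) / (2 * (k + 1))  # equals C_k / (4 * C_{k-1})
    return s0, s1

s0_big, _ = partial_sums(10_000)
_, s1_100 = partial_sums(100)
_, s1_400 = partial_sums(400)
print(round(s0_big, 4), round(s1_100, 3), round(s1_400, 3))
```

The first sum creeps up towards 1, consistent with $c = 1$ being the root of the characteristic equation, while the second roughly doubles each time the number of terms is quadrupled, i.e., it grows like the square root of the cut-off and has no finite limit.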



7 Conclusion and Open Questions 

We have just seen $O(n \log n)$ time algorithms for constructing almost optimal prefix-free codes for source letters with probabilities $p_1, \ldots, p_n$ when the costs of the letters of the encoding alphabet are unequal values $C = \{c_1, c_2, \ldots\}$. For many finite encoding alphabets, our algorithms have provably smaller redundancy (error) than previous algorithms given in [20], [8], [23], [2]. Our algorithms also are the first that give provably bounded redundancy for some infinite alphabets.

There are still many open questions left. The first arises by noting 
that, for the finite case, the previous algorithms were implicitly construct- 
ing alphabetic codes. Our proof explicitly uses the fact that we are only 
constructing general codes. It would be interesting to examine whether it is 
possible to get better bounds for alphabetic codes (or to show that this is 
not possible). 

Another open question concerns Theorem 4 in which we showed that if $\sum_{m=1}^{\infty} c_m 2^{-cc_m} < \infty$, then,

$$\forall \epsilon > 0, \quad C(T) - OPT \le \epsilon\,OPT + f(C,\epsilon).$$

Is it possible to improve this for some general $C$ to get a purely additive error rather than a multiplicative one combined with an additive one?

Finally, in Case 5 of the last section we gave a natural example for which the root $c$ of $\sum_{m=1}^{\infty} 2^{-cc_m} = 1$ exists but for which $\sum_{m=1}^{\infty} c_m 2^{-cc_m} = \infty$ so that we can not apply Theorem 4 and therefore have no error bound. It would be interesting to devise an analysis that would work for such cases as well.